SVG Tutorial for spatialHeatmap

Maintainer: Jianhai Zhang (jzhan067@ucr.edu; zhang.jianhai@hotmail.com)

1 Summary

The R/Bioconductor package spatialHeatmap provides intuitive visualisation of large-scale data, provided that a configured pair of data matrix and SVG image is supplied. This tutorial explains step by step how to make an SVG image and configure it to match the data matrix.

To make a custom SVG image, the following are required: a PNG image in which regions have clear contours, a data matrix, the SVG editor Inkscape, and optionally the image editor GIMP. The PNG image serves as the template, while the data matrix is used to colour different regions in the SVG image. Inkscape is used to associate the SVG image with the data matrix, and GIMP is optionally used to automatically extract polygons for the SVG image.

The tutorial below uses a configured pair of gene expression matrix and SVG image of root tissues. All the files used in this tutorial can be downloaded here (download an SVG image: hover over the image, right click, and select “Save image as…”; download a PNG image: click the image, click “Download”, right click, and select “Save image as…”; download a TXT file: click the file, click “Raw”, right click, and select “Save as…”).

2 Procedure

2.1 Gene Expression Matrix

The gene expression matrix should be normalised and filtered before downstream processing. In this tutorial the normalisation process is not covered. The function filter.data (Morgan et al. 2018; Dowle and Srinivasan 2018; R Core Team 2018) has two filter arguments pOA and CV, corresponding to pOverA and cv in the package “genefilter” (Gentleman et al. 2018), respectively.

In the gene expression matrix, the row and column names should be gene IDs and sample/conditions respectively. The sample/condition names MUST be formatted as follows: a sample name, followed by a double underscore, followed by the condition, such as “epidermis__140mM_1h” in Table 1 (Geng et al. 2013), where epidermis is the sample and 140mM_1h is the condition. Apart from the double underscore separator, only letters, digits, single underscores, single spaces, and dots are allowed in these column names. Samples in the matrix need not all be present in the SVG image, and vice versa. Only samples present in the SVG image are recognised and coloured.
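The naming rule above can be checked before plotting. The following is a minimal sketch in base R; `check_colnames` is a hypothetical helper written for this tutorial, not a spatialHeatmap function, and it assumes a strict reading of the rule (exactly one double underscore, and separators never adjacent to each other).

```r
# Hypothetical helper (not part of spatialHeatmap): tests whether column
# names follow the "sample__condition" convention described above.
check_colnames <- function(x) {
  parts <- strsplit(x, "__", fixed = TRUE)
  vapply(parts, function(p) {
    # Exactly one double underscore gives exactly two non-empty parts.
    if (length(p) != 2 || any(p == "")) return(FALSE)
    # Each part: letters, digits, dots, joined by single underscores or spaces.
    all(grepl("^[A-Za-z0-9.]+([_ ][A-Za-z0-9.]+)*$", p))
  }, logical(1))
}

cols <- c("epidermis__140mM_1h",   # valid
          "epidermis_140mM_1h",    # no double underscore
          "stele__140mM-1h",       # hyphen is not allowed
          "cortex__standard 1h")   # single space is allowed
res <- check_colnames(cols)
print(data.frame(column = cols, valid = res))
```

Names flagged as invalid can then be corrected in the input file before the matrix is loaded.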

The expression matrix is stored as a “SummarizedExperiment” object. Metadata of genes and sample/conditions can be optionally added. Refer to the R package “SummarizedExperiment” for more details (Morgan et al. 2018).

  library(spatialHeatmap); library(data.table); library(SummarizedExperiment)
  # Create the "SummarizedExperiment" object.
  ## The expression matrix, where the row and column names should be gene IDs and sample/conditions, respectively.
  data.path <- system.file("extdata/example", "root_expr_row_gen.txt", package = "spatialHeatmap")
  expr <- fread(data.path, sep='\t', header=TRUE, fill=TRUE)
  col.na <- colnames(expr)[-ncol(expr)]; row.na <- as.data.frame(expr[, 1])[, 1]
  expr <- as.matrix(as.data.frame(expr, stringsAsFactors=FALSE)[, -1])
  rownames(expr) <- row.na; colnames(expr) <- col.na
  con.path <- system.file("extdata/example", "root_con.txt", package = "spatialHeatmap") 
  ## Condition is a single column data frame.
  con <- read.table(con.path, header=TRUE, row.names=NULL, sep='\t', stringsAsFactors=FALSE)
  ann.path <- system.file("extdata/example", "root_ann.txt", package = "spatialHeatmap")
  ## Gene annotation is a single column data frame.
  ann <- read.table(ann.path, header=TRUE, row.names=1, sep='\t', stringsAsFactors=FALSE)
  ## The expression matrix, gene annotation, and condition are stored in a "SummarizedExperiment" object. Gene annotation and condition are optional.
  expr <- SummarizedExperiment(assays=list(expr=expr), rowData=ann, colData=con)

  # Filter genes. In "pOA", genes with expression value A >= 1 in at least p=0.03 (3%) of all samples are retained; in "CV", genes with a coefficient of variation (cv) between 0.1 and 10000 are retained, where the upper limit is set very high (10000) so as to keep all genes with cv over 0.1.
  exp <- filter.data(data=expr, pOA=c(1, 0.03), CV=c(0.1, 10000), dir=NULL)

  # Get the filtered matrix. "filter.data" returns a "SummarizedExperiment" object.
  df <- assay(exp)

Table 1. Gene expression matrix. Rows and columns are genes and sample/conditions respectively.

      epidermis__standard_1h  epidermis__140mM_1h  epidermis__140mM_3h  epidermis__140mM_8h
PSAC                2.944807             2.457910             2.862155             2.313841
NDHG                4.243482             4.072965             4.154074             4.935179
PETG                4.830398             5.516260             5.390418             5.372507
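To make the two filter arguments more concrete, here is a base-R sketch of the filtering logic as described in the comments above. It is an illustration on a made-up toy matrix, not the implementation used by filter.data or “genefilter”; the helper names `keep_pOA` and `keep_CV` are hypothetical, and the `>=` comparison simply follows the wording of this tutorial.

```r
# Toy matrix: 3 genes x 4 sample__condition columns (values are made up).
toy <- matrix(c(2.9, 2.5, 2.9, 2.3,
                0.2, 0.1, 0.3, 0.2,
                5.0, 5.0, 5.0, 5.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(
                c("geneA", "geneB", "geneC"),
                paste0("epidermis__",
                       c("standard_1h", "140mM_1h", "140mM_3h", "140mM_8h"))))

# Proportion-over-A rule: value >= A in at least a fraction p of the columns.
keep_pOA <- function(m, A, p) rowMeans(m >= A) >= p

# Coefficient of variation (sd / |mean|) must lie within [lower, upper].
keep_CV <- function(m, lower, upper) {
  cv <- apply(m, 1, sd) / abs(rowMeans(m))
  cv >= lower & cv <= upper
}

keep <- keep_pOA(toy, A = 1, p = 0.03) & keep_CV(toy, lower = 0.1, upper = 10000)
filtered <- toy[keep, , drop = FALSE]
print(rownames(filtered))
```

In this toy example geneB is dropped because it never reaches an expression value of 1, and geneC is dropped because it is constant across samples (cv of 0).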

2.2 Make SVG Images

2.2.1 Make Blank SVG Images by Drawing

If the contours in the PNG image are not clear, GIMP tends to generate low-quality SVG images. In this case, one can draw the blank SVG image in Inkscape, using the PNG as a template. Below is an example of drawing only two polygons.

  1. Draw polygons

    Open the root PNG image (Mustroph et al. 2009) in Inkscape. The image can be zoomed by pressing “-” or “+” on the keyboard. Select “Draw freehand lines (F6)” in the left tool bar. Left click once at the first corner of the polygon, move to the second corner and double left click, and so on. Lastly, when drawing the last line, click at the first corner to seal the polygon.


  2. Align polygons

    Select “Edit path by nodes (F2)” from the left tool bar. Draw a large rectangle to select the whole sealed polygon and draw another large rectangle to select all nodes.

    Click the “Make selected nodes smooth” in the top tool bar. Drag the edges and handles to make the sealed polygon aligned with the template, then the first polygon is finished.

  3. Make polygons with shapes

    Alternatively, the polygons can be made with rectangles. Click “Create rectangles and squares (F4)” at the left tool bar and draw a rectangle in the second template polygon. Select the rectangle and click “Object to Path” under the “Path” tab at the top, then the rectangle becomes an SVG path, which can be edited. Switch cursor to “Select and Transform Objects (F1)” and rotate the rectangle as expected.

    Switch the cursor to “Edit path by nodes (F2)” and select all nodes of the rectangle. Click “Make selected nodes corner” at the top tool bar. Drag the edges and handles to overlay the rectangle path on the template, then the second polygon is finished.

In this root image there are many polygons, so it takes too much time to draw them individually. However, GIMP can be used to extract the polygons automatically.

2.2.2 Make Blank SVG Images with GIMP

If the png image contains many polygons and their contours are clear (e.g. the root image), GIMP can be used to automatically extract the polygons.

  1. Open the root png image with GIMP. There are five coloured tissues (skyblue, green, red, yellow, blue). The skyblue is not present in the gene matrix while the green, red, yellow, blue correspond to epidermis, cortex, endodermis, stele in the gene matrix respectively. Enable the “Paths” panel under the “Windows” tab at the top. Right click and select “By Color” under “Select”.

  2. Click on the skyblue tissue, then right click, select “To Path” under “Select”, and the paths of all skyblue tissues are extracted in the “Paths” panel.

  3. Similarly, extract the paths of the other tissues. In the “Paths” panel, make the paths visible by enabling the eye symbol, then right click and merge the visible paths. Right click and export the merged paths; the exported file is a blank SVG image.

2.2.3 Make Blank SVG Images with GIMP and Drawing

The drawing method creates accurate SVG images but is time-consuming, while the GIMP method is faster but can generate fused polygons. In this tutorial, the root PNG image has well-separated polygons and clear contours, so GIMP can produce accurate SVG images. Otherwise, the resulting SVG images would contain fused and noisy polygons. Considering the pros and cons of the two methods, good practice is to first use GIMP to extract polygons and then use the drawing method to refine them. Below is an example of an SVG image with fused polygons, which was generated by GIMP.

  1. Place all SVG paths inside a layer

    Open the blank fused SVG image in Inkscape (the image can be zoomed by pressing “-” or “+” on the keyboard). Under the “Layers” tab at the top, click “Layers”; the “Layers” panel appears on the right. Then click “+” at the bottom left corner of the panel to add “Layer 1”.

    Draw a rectangle over the fused SVG graph and cut it. Then click on “Layer 1” and paste the fused SVG into “Layer 1” (make sure “Layer 1” is unlocked by referring to the lock symbol). Open the “XML Editor” from the “Edit” tab at the top. If <svg:path id="path 77"> is under <svg:g id="layer1" inkscape:label="Layer 1">, then the fused SVG is inside “Layer 1”.

  2. Refine and modify the SVG image

    Draw a rectangle over the image, click “Break Apart” under the “Path” tab, then select the outer noisy rectangle by clicking on its edge. Press “delete” on the keyboard to delete it.

    Click the edge of the large fused polygon and move it aside. Use the drawing method (Section 2.2.1) to make new polygons (blue) with the fused ones as templates. Delete the fused polygons and move the new polygons back to make the tissue complete.

    If a large fused polygon needs to be separated, one can use the eraser tool. Drag the fused polygon away from the tissue, select the eraser tool from the left tool bar, then use the eraser to cut the fused polygon into three independent polygons. Select the cut polygons and click “Break Apart” under the “Path” tab at the top, then the three polygons are separated. Place the three polygons back to make the tissue complete.

  3. Fill and stroke

    Under “Object” tab at the top, select “Fill and Stroke…”, then the “Fill and Stroke (Shift+Ctrl+F)” panel will come out on the right. Select all polygons by drawing a large rectangle over them.

    Under the “Stroke paint” tab in the fill and stroke panel, select “Flat color”. Under the “Stroke style” tab, set the stroke width, e.g. 1.5 px.

    Under the “Fill” tab, click “No paint” to get a blank SVG image, which is ready to use in the next section.

2.3 Configure the SVG Image with Gene Matrix

In the SVG image, each polygon has a unique ID. To plot spatial heatmaps, these IDs should be replaced with the exact sample names in the gene matrix. No matter how the blank SVG image is created, it should be placed inside a layer before starting the following steps.

  1. Group same tissues

    If multiple polygons belong to the same tissue type, they should be grouped together. The example of grouping epidermis is given below. Open “XML Editor…” under the “Edit” tab at the top, then the “XML Editor (Shift+Ctrl+X)” panel comes out on the right. Click all the epidermis polygons while pressing the “Shift” key, right click, and select “Group”. A group should not contain another group.

  2. Associate polygons with samples

    The epidermis group <svg:g id="g987"> shows up in the XML panel. Click “id”, change “g987” to “epidermis”, and click “Set”, then the new group id is set.

    In the “Fill and Stroke (Shift+Ctrl+F)” panel, select “Flat color” under the “Fill” tab, then specify a colour for epidermis.

    Group the other tissues, and set their ids and colours in the same way.

  3. Polygon order

    The polygons are stacked over each other according to their order in the “XML Editor”, so the first polygon might be invisible because it could be covered by the second, third, and so on. For instance, in the brain SVG image (anatomybodysystem.com 2017; epilepsyresearch 2017), the grey outline polygon (the path “rect5480”) is the first one, and is partially covered by other polygons. Therefore, users should drag and organise the paths into the expected order.
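Once the group ids are set, it can help to confirm that they really match the sample names in the matrix. The sketch below does this in base R on a small stand-in SVG written to a temporary file; the SVG content and column names are made up for illustration. Extracting ids with a regular expression is a rough shortcut that assumes attributes are written as id="..."; a proper XML parser (for example the xml2 package) would be more robust.

```r
# A tiny stand-in SVG (hypothetical content) written to a temporary file.
svg_file <- tempfile(fileext = ".svg")
writeLines(c(
  '<svg xmlns="http://www.w3.org/2000/svg">',
  '  <g id="layer1">',
  '    <g id="epidermis"><path id="path10" d="M0 0 L1 0 L1 1 Z"/></g>',
  '    <g id="cortex"><path id="path11" d="M2 0 L3 0 L3 1 Z"/></g>',
  '    <path id="path77" d="M4 0 L5 0 L5 1 Z"/>',
  '  </g>',
  '</svg>'), svg_file)

# Column names in the "sample__condition" format; the sample is the part
# before the double underscore.
cols <- c("epidermis__standard_1h", "epidermis__140mM_1h",
          "cortex__140mM_1h", "stele__140mM_1h")
samples <- unique(sub("__.*$", "", cols))

# Pull every id="..." attribute value out of the SVG text.
txt <- paste(readLines(svg_file), collapse = "\n")
ids <- regmatches(txt, gregexpr('(?<=\\sid=")[^"]+', txt, perl = TRUE))[[1]]

matched <- intersect(samples, ids)         # samples that would be coloured
missing_in_svg <- setdiff(samples, ids)    # samples with no matching id
print(matched)
print(missing_in_svg)
```

In this example the stele sample has no counterpart in the SVG, so only epidermis and cortex would be coloured.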

2.4 Add Text to Label Tissues

Users can add text to label tissues. Basically, the text is first typed in with the text tool and then the text object is converted to paths. Next, the paths are added to the polygon group of the target tissue and filled with the same tissue colour. Below is an example of adding text to the epidermis tissue.

  1. Create text paths

    Select “Create and edit text objects (F8)” from the left tool bar, drag a text box, and type epidermis. Click on the text object and convert it to a path.

  2. Text fill and stroke

    Click on the text paths and fill them with the same colour as the epidermis using “Pick colors from image (F7)” from the left tool bar. In the “Fill and Stroke” panel, set the stroke style.

  3. Add text paths to tissue group

    Click and cut the text, then double click epidermis (green) to enter the group.

    Paste the text anywhere; the text is then inside the epidermis group as a “text group”. Move or resize the text group as needed. Right click and ungroup the text.

    All the letters are now in the epidermis group as individual paths, which can be seen in the XML Editor.

    To add a pointer, draw a rectangle, convert it to a path, and fill it with the same style as the epidermis. Move, rotate, and resize it as needed.

    Clicking any other tissue exits the group, leaving the text, pointer, and epidermis grouped together. This can be confirmed by dragging the epidermis and checking that they move as a whole.

    Similarly, add text to label the other tissues.

  4. Save the final SVG image, which is ready to use in spatialHeatmap. The name of the saved SVG image can only consist of letters, digits, and underscores. E.g. “root_cross_final.svg” is acceptable while “root_cross_final(copy).svg” will give errors.
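The file-name rule in the last step can be tested with a one-line regular expression. `valid_svg_name` below is a hypothetical helper for illustration, not a spatialHeatmap function; it assumes the extension is a lowercase “.svg”.

```r
# Hypothetical helper: TRUE if the file name (without directory) consists
# only of letters, digits, and underscores, followed by ".svg".
valid_svg_name <- function(path) grepl("^[A-Za-z0-9_]+\\.svg$", basename(path))

name_check <- valid_svg_name(c("root_cross_final.svg",
                               "root_cross_final(copy).svg"))
print(name_check)
```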

2.5 Troubleshooting

  1. Make sure the column names are formatted in the right way.
  2. Make sure the target tissue names are exactly the same in the data matrix and the SVG.
  3. Make sure all the paths and groups are placed in a large group (or layer) as a whole in the XML editor.
  4. Make sure that no group contains another group, except for the large group.
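Points 3 and 4 can be roughly screened outside Inkscape by measuring how deeply group elements are nested: a depth of 2 corresponds to one large group (layer) holding tissue groups, and anything deeper suggests a group inside a group. The following is a sketch only; `group_depth` is a hypothetical helper, and it assumes plain <g ...> and </g> tags with no self-closing groups.

```r
# Hypothetical helper: maximum nesting depth of <g> elements in SVG text.
# Assumes plain "<g ...>" / "</g>" tags and no self-closing "<g/>".
group_depth <- function(svg_text) {
  tags <- regmatches(svg_text, gregexpr("<g[ >]|</g>", svg_text))[[1]]
  if (length(tags) == 0) return(0L)
  # +1 for every opening tag, -1 for every closing tag; track the running depth.
  max(cumsum(ifelse(tags == "</g>", -1L, 1L)))
}

# Made-up examples: one layer with a tissue group, and one with an extra
# group nested inside the tissue group.
ok  <- '<svg><g id="layer1"><g id="epidermis"><path id="p1"/></g><path id="p2"/></g></svg>'
bad <- '<svg><g id="layer1"><g id="epidermis"><g id="g5"><path id="p1"/></g></g></g></svg>'

print(group_depth(ok))
print(group_depth(bad))
```

For a real file, the text can be read with `paste(readLines(path), collapse = "\n")` before calling the helper.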

References

Dowle, Matt, and Arun Srinivasan. 2018. Data.table: Extension of ‘Data.frame‘. https://CRAN.R-project.org/package=data.table.

epilepsyresearch. 2017. “The Hippocampus: What Is It?” https://www.epilepsyresearch.org.uk/the-hippocampus-what-is-it/.

Geng, Yu, Rui Wu, Choon Wei Wee, Fei Xie, Xueliang Wei, Penny Mei Yeen Chan, Cliff Tham, Lina Duan, and José R Dinneny. 2013. “A Spatio-Temporal Understanding of Growth Regulation During the Salt Stress Response in Arabidopsis.” Plant Cell 25 (6): 2132–54.

Gentleman, R, V Carey, W Huber, and F Hahne. 2018. “Genefilter: Methods for Filtering Genes from High-Throughput Experiments.” http://bioconductor.uib.no/2.7/bioc/html/genefilter.html.

Morgan, Martin, Valerie Obenchain, Jim Hester, and Hervé Pagès. 2018. SummarizedExperiment: SummarizedExperiment Container.

Mustroph, Angelika, M Eugenia Zanetti, Charles J H Jang, Hans E Holtan, Peter P Repetti, David W Galbraith, Thomas Girke, and Julia Bailey-Serres. 2009. “Profiling Translatomes of Discrete Cell Populations Resolves Altered Cellular Priorities During Hypoxia in Arabidopsis.” Proc Natl Acad Sci U S A 106 (44): 18843–8.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.